Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 17 de 17
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
2.
IEEE Trans Vis Comput Graph ; 28(1): 140-150, 2022 01.
Artigo em Inglês | MEDLINE | ID: mdl-34596551

RESUMO

The combination of diverse data types and analysis tasks in genomics has resulted in the development of a wide range of visualization techniques and tools. However, most existing tools are tailored to a specific problem or data type and offer limited customization, making it challenging to optimize visualizations for new analysis tasks or datasets. To address this challenge, we designed Gosling-a grammar for interactive and scalable genomics data visualization. Gosling balances expressiveness for comprehensive multi-scale genomics data visualizations with accessibility for domain scientists. Our accompanying JavaScript toolkit called Gosling.js provides scalable and interactive rendering. Gosling.js is built on top of an existing platform for web-based genomics data visualization to further simplify the visualization of common genomics data formats. We demonstrate the expressiveness of the grammar through a variety of real-world examples. Furthermore, we show how Gosling supports the design of novel genomics visualizations. An online editor and examples of Gosling.js, its source code, and documentation are available at https://gosling.js.org.


Assuntos
Gráficos por Computador , Visualização de Dados , Animais , Gansos , Genômica , Software
3.
Nature ; 593(7858): 238-243, 2021 05.
Artigo em Inglês | MEDLINE | ID: mdl-33828297

RESUMO

Genome-wide association studies (GWAS) have identified thousands of noncoding loci that are associated with human diseases and complex traits, each of which could reveal insights into the mechanisms of disease1. Many of the underlying causal variants may affect enhancers2,3, but we lack accurate maps of enhancers and their target genes to interpret such variants. We recently developed the activity-by-contact (ABC) model to predict which enhancers regulate which genes and validated the model using CRISPR perturbations in several cell types4. Here we apply this ABC model to create enhancer-gene maps in 131 human cell types and tissues, and use these maps to interpret the functions of GWAS variants. Across 72 diseases and complex traits, ABC links 5,036 GWAS signals to 2,249 unique genes, including a class of 577 genes that appear to influence multiple phenotypes through variants in enhancers that act in different cell types. In inflammatory bowel disease (IBD), causal variants are enriched in predicted enhancers by more than 20-fold in particular cell types such as dendritic cells, and ABC achieves higher precision than other regulatory methods at connecting noncoding variants to target genes. These variant-to-function maps reveal an enhancer that contains an IBD risk variant and that regulates the expression of PPIF to alter the membrane potential of mitochondria in macrophages. Our study reveals principles of genome regulation, identifies genes that affect IBD and provides a resource and generalizable strategy to connect risk variants of common diseases to their molecular and cellular functions.


Assuntos
Elementos Facilitadores Genéticos/genética , Predisposição Genética para Doença , Variação Genética/genética , Genoma Humano/genética , Estudo de Associação Genômica Ampla , Doenças Inflamatórias Intestinais/genética , Linhagem Celular , Cromossomos Humanos Par 10/genética , Ciclofilinas/genética , Células Dendríticas , Feminino , Humanos , Macrófagos/metabolismo , Masculino , Mitocôndrias/metabolismo , Especificidade de Órgãos/genética , Fenótipo
4.
IEEE Trans Vis Comput Graph ; 27(2): 358-368, 2021 02.
Artigo em Inglês | MEDLINE | ID: mdl-33026994

RESUMO

Small multiples are miniature representations of visual information used generically across many domains. Handling large numbers of small multiples imposes challenges on many analytic tasks like inspection, comparison, navigation, or annotation. To address these challenges, we developed a framework and implemented a library called PILlNG.JS for designing interactive piling interfaces. Based on the piling metaphor, such interfaces afford flexible organization, exploration, and comparison of large numbers of small multiples by interactively aggregating visual objects into piles. Based on a systematic analysis of previous work, we present a structured design space to guide the design of visual piling interfaces. To enable designers to efficiently build their own visual piling interfaces, PILlNG.JS provides a declarative interface to avoid having to write low-level code and implements common aspects of the design space. An accompanying GUI additionally supports the dynamic configuration of the piling interface. We demonstrate the expressiveness of PILlNG.JS with examples from machine learning, immunofluorescence microscopy, genomics, and public health.


Assuntos
Gráficos por Computador , Genômica
5.
IEEE Trans Vis Comput Graph ; 26(1): 611-621, 2020 01.
Artigo em Inglês | MEDLINE | ID: mdl-31442989

RESUMO

We present Scalable Insets, a technique for interactively exploring and navigating large numbers of annotated patterns in multiscale visualizations such as gigapixel images, matrices, or maps. Exploration of many but sparsely-distributed patterns in multiscale visualizations is challenging as visual representations change across zoom levels, context and navigational cues get lost upon zooming, and navigation is time consuming. Our technique visualizes annotated patterns too small to be identifiable at certain zoom levels using insets, i.e., magnified thumbnail views of the annotated patterns. Insets support users in searching, comparing, and contextualizing patterns while reducing the amount of navigation needed. They are dynamically placed either within the viewport or along the boundary of the viewport to offer a compromise between locality and context preservation. Annotated patterns are interactively clustered by location and type. They are visually represented as an aggregated inset to provide scalable exploration within a single viewport. In a controlled user study with 18 participants, we found that Scalable Insets can speed up visual search and improve the accuracy of pattern comparison at the cost of slower frequency estimation compared to a baseline technique. A second study with 6 experts in the field of genomics showed that Scalable Insets is easy to learn and provides first insights into how Scalable Insets can be applied in an open-ended data exploration scenario.


Assuntos
Gráficos por Computador , Curadoria de Dados , Interface Usuário-Computador , Adolescente , Adulto , Algoritmos , Bases de Dados Factuais , Feminino , Genômica , Humanos , Masculino , Mapas como Assunto , Análise e Desempenho de Tarefas , Adulto Jovem
6.
Comput Graph Forum ; 39(3): 167-179, 2020 Jun.
Artigo em Inglês | MEDLINE | ID: mdl-34334852

RESUMO

We present Peax, a novel feature-based technique for interactive visual pattern search in sequential data, like time series or data mapped to a genome sequence. Visually searching for patterns by similarity is often challenging because of the large search space, the visual complexity of patterns, and the user's perception of similarity. For example, in genomics, researchers try to link patterns in multivariate sequential data to cellular or pathogenic processes, but a lack of ground truth and high variance makes automatic pattern detection unreliable. We have developed a convolutional autoencoder for unsupervised representation learning of regions in sequential data that can capture more visual details of complex patterns compared to existing similarity measures. Using this learned representation as features of the sequential data, our accompanying visual query system enables interactive feedback-driven adjustments of the pattern search to adapt to the users' perceived similarity. Using an active learning sampling strategy, Peax collects user-generated binary relevance feedback. This feedback is used to train a model for binary classification, to ultimately find other regions that exhibit patterns similar to the search target. We demonstrate Peax's features through a case study in genomics and report on a user study with eight domain experts to assess the usability and usefulness of Peax. Moreover, we evaluate the effectiveness of the learned feature representation for visual similarity search in two additional user studies. We find that our models retrieve significantly more similar patterns than other commonly used techniques.

7.
Genome Biol ; 19(1): 125, 2018 08 24.
Artigo em Inglês | MEDLINE | ID: mdl-30143029

RESUMO

We present HiGlass, an open source visualization tool built on web technologies that provides a rich interface for rapid, multiplex, and multiscale navigation of 2D genomic maps alongside 1D genomic tracks, allowing users to combine various data types, synchronize multiple visualization modalities, and share fully customizable views with others. We demonstrate its utility in exploring different experimental conditions, comparing the results of analyses, and creating interactive snapshots to share with collaborators and the broader public. HiGlass is accessible online at http://higlass.io and is also available as a containerized application that can be run on any platform.


Assuntos
Mapeamento Cromossômico , Genoma , Internet , Interface Usuário-Computador
8.
Stem Cell Reports ; 10(1): 1-6, 2018 01 09.
Artigo em Inglês | MEDLINE | ID: mdl-29320760

RESUMO

Unambiguous cell line authentication is essential to avoid loss of association between data and cells. The risk for loss of references increases with the rapidity that new human pluripotent stem cell (hPSC) lines are generated, exchanged, and implemented. Ideally, a single name should be used as a generally applied reference for each cell line to access and unify cell-related information across publications, cell banks, cell registries, and databases and to ensure scientific reproducibility. We discuss the needs and requirements for such a unique identifier and implement a standard nomenclature for hPSCs, which can be automatically generated and registered by the human pluripotent stem cell registry (hPSCreg). To avoid ambiguities in PSC-line referencing, we strongly urge publishers to demand registration and use of the standard name when publishing research based on hPSC lines.


Assuntos
Bancos de Espécimes Biológicos , Bases de Dados Factuais , Células-Tronco Pluripotentes , Sistema de Registros , Terminologia como Assunto , Humanos
9.
Bioinformatics ; 34(7): 1200-1207, 2018 04 01.
Artigo em Inglês | MEDLINE | ID: mdl-29186292

RESUMO

Motivation: The ever-increasing number of biomedical datasets provides tremendous opportunities for re-use but current data repositories provide limited means of exploration apart from text-based search. Ontological metadata annotations provide context by semantically relating datasets. Visualizing this rich network of relationships can improve the explorability of large data repositories and help researchers find datasets of interest. Results: We developed SATORI-an integrative search and visual exploration interface for the exploration of biomedical data repositories. The design is informed by a requirements analysis through a series of semi-structured interviews. We evaluated the implementation of SATORI in a field study on a real-world data collection. SATORI enables researchers to seamlessly search, browse and semantically query data repositories via two visualizations that are highly interconnected with a powerful search interface. Availability and implementation: SATORI is an open-source web application, which is freely available at http://satori.refinery-platform.org and integrated into the Refinery Platform. Contact: nils@hms.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Assuntos
Ontologias Biológicas , Biologia Computacional/métodos , Metadados , Software , Animais , Humanos , Internet , Semântica
10.
IEEE Trans Vis Comput Graph ; 24(1): 522-531, 2018 01.
Artigo em Inglês | MEDLINE | ID: mdl-28866592

RESUMO

This paper presents an interactive visualization interface-HiPiler-for the exploration and visualization of regions-of-interest in large genome interaction matrices. Genome interaction matrices approximate the physical distance of pairs of regions on the genome to each other and can contain up to 3 million rows and columns with many sparse regions. Regions of interest (ROIs) can be defined, e.g., by sets of adjacent rows and columns, or by specific visual patterns in the matrix. However, traditional matrix aggregation or pan-and-zoom interfaces fail in supporting search, inspection, and comparison of ROIs in such large matrices. In HiPiler, ROIs are first-class objects, represented as thumbnail-like "snippets". Snippets can be interactively explored and grouped or laid out automatically in scatterplots, or through dimension reduction methods. Snippets are linked to the entire navigable genome interaction matrix through brushing and linking. The design of HiPiler is based on a series of semi-structured interviews with 10 domain experts involved in the analysis and interpretation of genome interaction matrices. We describe six exploration tasks that are crucial for analysis of interaction matrices and demonstrate how HiPiler supports these tasks. We report on a user study with a series of data exploration sessions with domain experts to assess the usability of HiPiler as well as to demonstrate respective findings in the data.


Assuntos
Gráficos por Computador , Genômica/métodos , Processamento de Imagem Assistida por Computador/métodos , Software , Algoritmos , Humanos
11.
Artigo em Inglês | MEDLINE | ID: mdl-31396383

RESUMO

Pattern extraction algorithms are enabling insights into the ever-growing amount of today's datasets by translating reoccurring data properties into compact representations. Yet, a practical problem arises: With increasing data volumes and complexity also the number of patterns increases, leaving the analyst with a vast result space. Current algorithmic and especially visualization approaches often fail to answer central overview questions essential for a comprehensive understanding of pattern distributions and support, their quality, and relevance to the analysis task. To address these challenges, we contribute a visual analytics pipeline targeted on the pattern-driven exploration of result spaces in a semi-automatic fashion. Specifically, we combine image feature analysis and unsupervised learning to partition the pattern space into interpretable, coherent chunks, which should be given priority in a subsequent in-depth analysis. In our analysis scenarios, no ground-truth is given. Thus, we employ and evaluate novel quality metrics derived from the distance distributions of our image feature vectors and the derived cluster model to guide the feature selection process. We visualize our results interactively, allowing the user to drill down from overview to detail into the pattern space and demonstrate our techniques in two case studies on Earth observation and biomedical genomic data.

12.
Nucleic Acids Res ; 44(D1): D757-63, 2016 Jan 04.
Artigo em Inglês | MEDLINE | ID: mdl-26400179

RESUMO

The human pluripotent stem cell registry (hPSCreg), accessible at http://hpscreg.eu, is a public registry and data portal for human embryonic and induced pluripotent stem cell lines (hESC and hiPSC). Since their first isolation the number of hESC lines has steadily increased to over 3000 and new iPSC lines are generated in a rapidly growing number of laboratories as a result of their potentially broad applicability in biomedicine and drug testing. Many of these lines are deposited in stem cell banks, which are globally established to store tens of thousands of lines from healthy and diseased donors. The Registry provides comprehensive and standardized biological and legal information as well as tools to search and compare information from multiple hPSC sources and hence addresses a translational research need. To facilitate unambiguous identification over different resources, hPSCreg automatically creates a unique standardized name for each cell line registered. In addition to biological information, hPSCreg stores extensive data about ethical standards regarding cell sourcing and conditions for application and privacy protection. hPSCreg is the first global registry that holds both, manually validated scientific and ethical information on hPSC lines, and provides access by means of a user-friendly, mobile-ready web application.


Assuntos
Linhagem Celular , Células-Tronco Embrionárias , Células-Tronco Pluripotentes Induzidas , Sistema de Registros , Humanos , Internet
13.
BMC Genomics ; 16: 645, 2015 Aug 28.
Artigo em Inglês | MEDLINE | ID: mdl-26314578

RESUMO

BACKGROUND: Identification of marker genes associated with a specific tissue/cell type is a fundamental challenge in genetic and cell research. Marker genes are of great importance for determining cell identity, and for understanding tissue specific gene function and the molecular mechanisms underlying complex diseases. RESULTS: We have developed a new bioinformatics tool called MGFM (Marker Gene Finder in Microarray data) to predict marker genes from microarray gene expression data. Marker genes are identified through the grouping of samples of the same type with similar marker gene expression levels. We verified our approach using two microarray data sets from the NCBI's Gene Expression Omnibus public repository encompassing samples for similar sets of five human tissues (brain, heart, kidney, liver, and lung). Comparison with another tool for tissue-specific gene identification and validation with literature-derived established tissue markers established functionality, accuracy and simplicity of our tool. Furthermore, top ranked marker genes were experimentally validated by reverse transcriptase-polymerase chain reaction (RT-PCR). The sets of predicted marker genes associated with the five selected tissues comprised well-known genes of particular importance in these tissues. The tool is freely available from the Bioconductor web site, and it is also provided as an online application integrated into the CellFinder platform ( http://cellfinder.org/analysis/marker ). CONCLUSIONS: MGFM is a useful tool to predict tissue/cell type marker genes using microarray gene expression data. The implementation of the tool as an R-package as well as an application within CellFinder facilitates its use.


Assuntos
Biologia Computacional/métodos , Perfilação da Expressão Gênica/métodos , Marcadores Genéticos , Ontologia Genética , Estudos de Associação Genética/métodos , Especificidade de Órgãos/genética , Reprodutibilidade dos Testes , Navegador
14.
Bioinformatics ; 31(5): 794-6, 2015 Mar 01.
Artigo em Inglês | MEDLINE | ID: mdl-25344497

RESUMO

UNLABELLED: Advancing technologies generate large amounts of molecular and phenotypic data on cells, tissues and organisms, leading to an ever-growing detail and complexity while information retrieval and analysis becomes increasingly time-consuming. The Semantic Body Browser is a web application for intuitively exploring the body of an organism from the organ to the subcellular level and visualising expression profiles by means of semantically annotated anatomical illustrations. It is used to comprehend biological and medical data related to the different body structures while relying on the strong pattern recognition capabilities of human users. AVAILABILITY AND IMPLEMENTATION: The Semantic Body Browser is a JavaScript web application that is freely available at http://sbb.cellfinder.org. The source code is provided on https://github.com/flekschas/sbb.


Assuntos
Gráficos por Computador , Expressão Gênica , Corpo Humano , Software , Humanos , Armazenamento e Recuperação da Informação , Internet , Masculino , Semântica
15.
Nucleic Acids Res ; 42(Database issue): D950-8, 2014 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-24304896

RESUMO

CellFinder (http://www.cellfinder.org) is a comprehensive one-stop resource for molecular data characterizing mammalian cells in different tissues and in different development stages. It is built from carefully selected data sets stemming from other curated databases and the biomedical literature. To date, CellFinder describes 3394 cell types and 50 951 cell lines. The database currently contains 3055 microscopic and anatomical images, 205 whole-genome expression profiles of 194 cell/tissue types from RNA-seq and microarrays and 553 905 protein expressions for 535 cells/tissues. Text mining of a corpus of >2000 publications followed by manual curation confirmed expression information on ∼900 proteins and genes. CellFinder's data model is capable to seamlessly represent entities from single cells to the organ level, to incorporate mappings between homologous entities in different species and to describe processes of cell development and differentiation. Its ontological backbone currently consists of 204 741 ontology terms incorporated from 10 different ontologies unified under the novel CELDA ontology. CellFinder's web portal allows searching, browsing and comparing the stored data, interactive construction of developmental trees and navigating the partonomic hierarchy of cells and tissues through a unique body browser designed for life scientists and clinicians.


Assuntos
Células/metabolismo , Bases de Dados Factuais , Animais , Linhagem Celular , Fenômenos Fisiológicos Celulares , Células/citologia , Estruturas Celulares/ultraestrutura , Mineração de Dados , Perfilação da Expressão Gênica , Humanos , Internet , Rim/citologia , Fígado/citologia , Proteínas/metabolismo , RNA/metabolismo
16.
BMC Bioinformatics ; 14: 228, 2013 Jul 17.
Artigo em Inglês | MEDLINE | ID: mdl-23865855

RESUMO

BACKGROUND: The need for detailed description and modeling of cells drives the continuous generation of large and diverse datasets. Unfortunately, there exists no systematic and comprehensive way to organize these datasets and their information. CELDA (Cell: Expression, Localization, Development, Anatomy) is a novel ontology for the association of primary experimental data and derived knowledge to various types of cells of organisms. RESULTS: CELDA is a structure that can help to categorize cell types based on species, anatomical localization, subcellular structures, developmental stages and origin. It targets cells in vitro as well as in vivo. Instead of developing a novel ontology from scratch, we carefully designed CELDA in such a way that existing ontologies were integrated as much as possible, and only minimal extensions were performed to cover those classes and areas not present in any existing model. Currently, ten existing ontologies and models are linked to CELDA through the top-level ontology BioTop. Together with 15.439 newly created classes, CELDA contains more than 196.000 classes and 233.670 relationship axioms. CELDA is primarily used as a representational framework for modeling, analyzing and comparing cells within and across species in CellFinder, a web based data repository on cells (http://cellfinder.org). CONCLUSIONS: CELDA can semantically link diverse types of information about cell types. It has been integrated within the research platform CellFinder, where it exemplarily relates cell types from liver and kidney during development on the one hand and anatomical locations in humans on the other, integrating information on all spatial and temporal stages. CELDA is available from the CellFinder website: http://cellfinder.org/about/ontology.


Assuntos
Células/classificação , Vocabulário Controlado , Células/metabolismo , Estruturas Celulares , Células-Tronco Embrionárias , Expressão Gênica , Humanos , Rim/citologia
17.
Database (Oxford) ; 2013: bat020, 2013.
Artigo em Inglês | MEDLINE | ID: mdl-23599415

RESUMO

Biomedical literature curation is the process of automatically and/or manually deriving knowledge from scientific publications and recording it into specialized databases for structured delivery to users. It is a slow, error-prone, complex, costly and, yet, highly important task. Previous experiences have proven that text mining can assist in its many phases, especially, in triage of relevant documents and extraction of named entities and biological events. Here, we present the curation pipeline of the CellFinder database, a repository of cell research, which includes data derived from literature curation and microarrays to identify cell types, cell lines, organs and so forth, and especially patterns in gene expression. The curation pipeline is based on freely available tools in all text mining steps, as well as the manual validation of extracted data. Preliminary results are presented for a data set of 2376 full texts from which >4500 gene expression events in cell or anatomical part have been extracted. Validation of half of this data resulted in a precision of ~50% of the extracted data, which indicates that we are on the right track with our pipeline for the proposed task. However, evaluation of the methods shows that there is still room for improvement in the named-entity recognition and that a larger and more robust corpus is needed to achieve a better performance for event extraction. Database URL: http://www.cellfinder.org/


Assuntos
Biologia Computacional/métodos , Mineração de Dados , Regulação da Expressão Gênica , Rim/citologia , Rim/metabolismo , Publicações , Software , Bases de Dados como Assunto , Humanos , Rim/anatomia & histologia , Reprodutibilidade dos Testes , Estatística como Assunto
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...